Abstract

A novel approach in molecular design is presented, where in vivo formed complementarity determining regions (CDR) from
antibody genes were shuffled into a specific framework region. A synthetic gene library of soluble VH-fragments was created and 
the complexity of the library was determined by sequencing. The synthetic genes were diverse and contained random combinations 
of CDR from different germlines. All CDR were randomised in one step and by using in vivo formed CDR, the length, sequence 
and combination were varied simultaneously. © 1998 Elsevier Science B.V. All rights reserved. 
1. Introduction

Combinatorial biology provides an efficient way of 
creating large molecular libraries, and it has been 
applied, in particular, to the V-region of antibody genes. 
Since the probability of finding antibodies with a certain 
specificity is higher in an antibody library which contains 
a large number of individual clones, generation of 
diversity is a key element. The target segments for 
introduction of diversity are the complementarity deter- 
mining regions (CDR) of antibody genes and different 
ways have been used, such as PCR amplification of
V-regions using randomised primers covering the CDR
(Hoogenboom and Winter, 1992; Barbas III et al., 1992;
Griffiths et al., 1994) or the use of V-regions from in 
vivo immunised donors. Error-prone PCR or bacterial 
mutator strains have also been used to introduce muta- 
tion in a more unrestricted way (for review see Hayden 
et al., 1997). It is also possible to synthetically construct 
the entire antibody V-regions in vitro, using overlapping 
oligonucleotides subsequently assembled into full-length
genes. This concept was demonstrated recently with the 
assembly of antibody light-chain gene libraries, using 
totally randomised CDR (Hayashi et al., 1994) or by 
retaining the canonical residues of each CDR while 
randomising the remaining parts of these sequences 
during the in vitro synthesis (Söderlind et al., 1995),
which will minimise adverse structural effects. The 
approach using synthetic V-regions assembled in vitro 
has been demonstrated to yield a fully functional anti- 
FITC single-chain antibody fragment based on the 
DP-47 and DPL-3 germline gene sequences (Kobayashi 
et al., 1997), thus paving the way for construction of 
more elaborate synthetic antibody libraries. Similarly, 
the concept of molecular evolution through combination 
of DNA segments between related genes has been 
demonstrated by other laboratories (Crameri and 
Stemmer, 1995; Stemmer, 1994; Zhao et al., 1998). 

In the present study we have developed the concept 
of synthetic antibody design to allow the introduction 
of greater variability into a library, based on the utilisation
of gene diversity created in vivo. This was achieved
by shuffling CDR isolated in vivo into a single prese- 
lected human framework (master framework), and we 
used a single-domain VH antibody fragment to demonstrate
this concept. A molecular library of single-domain
VH antibody fragments was constructed on the DP-47
master framework (FR) that has been camelised in three 
positions (Davies and Riechmann, 1995) and using
CDR1-3 originating from different V genes formed in
vivo. This approach of CDR shuffling allowed, for the 
first time, a combination of CDR originating from
different antibody genes into one antibody fragment
gene, while also sampling the entire pool of CDR. An 
enormous diversity can be introduced into V-regions by 
this approach which, consequently, has the potential for
creating new binding specificities as well as for evolving 
antibody affinities in vitro in fragments with known 
specificities. 



2. Materials and methods 

2.1. Oligonucleotides and template DNA for the CDR 
amplification 

The DP-47 germline gene was selected as the master 
framework and its sequence was camelised, by mutating 
three residues, according to Davies and Riechmann 
(1995). To build a VH domain by the synthetic approach,
oligonucleotides were synthesised and purified, as
described previously (Söderlind et al., 1995). To shuffle
the CDR into the master framework, oligonucleotides
based on the DP-47 germline gene were used in a PCR.
For each CDR amplification, an oligonucleotide pair
was designed to amplify the CDR as well as to allow 
for one strand of the PCR product to be used in gene 
assembly (Fig. 1). A human cDNA library derived from 
peripheral blood lymphocytes was used as template for 
the CDR. The sequences of the oligonucleotides used in 
this procedure are shown in Table 1. These primers were
designed to amplify the CDR, as defined by Kabat et al.
(1991). In addition, the last codon of FR1 (VH codon
30) was included in the amplification of CDR1, and the
first codon of FR3 (VH codon 66) was included in the
amplification of CDR2. 

2.2. PCR amplification of the CDR 

A cDNA library constructed from peripheral blood 
B cells, producing IgM antibodies, was used as template 
(Ohlin et al., 1996). This library was PCR amplified, 
using primers amplifying intact genes belonging to the 
VH1, 3, 4 and 5 families, prior to CDR amplification.
All PCR were performed with reagents and AmpliTaq
polymerase from Perkin Elmer (Foster City, CA) and
thermocyclers from Cetus Corporation (Emeryville,
CA). Each CDR was amplified in a 100 µl reaction,
containing 200 µM of each dNTP, 1 µM of each primer
(one of which was biotinylated), 2.5 U AmpliTaq and
0.1-1 ng VH-encoding DNA. The reaction profile used
was: 94°C for 1 min, 55°C for 1 min, 72°C for 2 min for
30 cycles. To remove traces of the original template, the
PCR product was purified from a 2% SeaPlaque low
melting agarose gel (FMC Bioproducts, Rockland,
ME), using QIAEX II gel extraction kit (QIAGEN
GmbH, Hilden, Germany).

Fig. 1. The principle of CDR shuffling. CDR originating from different
antibody genes can be randomly assembled into a given framework
region. (A) Each CDR is amplified in a separate PCR reaction from,
e.g. a cDNA library. One of the primers in the PCR amplification of
each CDR is biotinylated. The PCR product is captured on a streptavidin-coated
affinity column and single-stranded DNA encoding the
CDR is prepared by alkali denaturation of the DNA and elution of
the non-biotinylated strand. This single-stranded DNA is subsequently
used in an assembly reaction, where CDR originating from different
antibody genes will be randomly combined into a given framework
region. (B) The location of the primers shown in Table 1. The primers
VH-AP1 and VH-AP2 are amplification primers and the primers
VH-H1 and VH-H4 are fully synthetic internal primers. The internal
primers VH-H2, -H3 and -H5 are prepared by amplification of naturally
occurring CDR, as described above. The amplification primers are
used to increase the copy number of each assembled gene, to introduce
an N-terminal FLAG sequence and to provide restriction sites necessary
for cloning. The boxes represent the CDR.
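
As a quick check on the reaction composition given in Section 2.2, the following sketch converts the stated concentrations into absolute amounts for a 100 µl CDR amplification. It is an illustration only: the helper name `amount_pmol` is hypothetical and not part of the original protocol, and the calculation simply uses the identity 1 µM × 1 µl = 1 pmol.

```python
def amount_pmol(conc_uM, volume_ul):
    """Return the amount of a component in pmol (1 uM x 1 ul = 1 pmol)."""
    return conc_uM * volume_ul


# Values taken from Section 2.2: 100 ul reaction, 200 uM each dNTP, 1 uM each primer.
REACTION_VOLUME_UL = 100
cdr_pcr = {
    "each dNTP": amount_pmol(200, REACTION_VOLUME_UL),
    "each primer": amount_pmol(1, REACTION_VOLUME_UL),
}

for component, pmol in cdr_pcr.items():
    print(f"{component}: {pmol:g} pmol per {REACTION_VOLUME_UL} ul reaction")
```

Under these figures each primer is present at 100 pmol and each dNTP at 20 nmol per reaction.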

2.3. Preparation of single-stranded CDR-encoding DNA 

To be able to utilise the PCR amplified CDR in an 
assembly reaction using overlapping oligonucleotides, 
single-stranded DNA was prepared initially. This 
was performed by affinity chromatography on the 
biotinylated strand, using an affini-tip (Genosys 
Biotechnologies, Inc, Pampisford, UK), as outlined in 
Fig. 1. Briefly, the biotinylated PCR product was affinity
captured on the streptavidin-coated matrix, and the
column was washed to remove any remaining template
DNA. The non-biotinylated strand was eluted by denaturing
the DNA with alkali. The purification was performed
essentially according to the manufacturer's
instructions, except that all the washing steps using
1 x binding buffer were performed twice as many times
as recommended. The eluted single-stranded DNA was
subsequently used in the assembly reaction.

Table 1

The sequences of the primers used for the amplification of each CDR, the internal primers used in the assembly reaction and the outermost
amplification primers used in the final amplification reaction

Primers for the amplification of the CDR
5' primer VH-H2: 5'-biotin-GTC CCT GAG ACT CTC CTG TGC AGC CTC TGG ATT CAC CTT T
3' primer VH-H2: 5'-TCC CTG GAG CCT GGC GGA CCC A
5' primer VH-H3: 5'-CGC CAG GCT CCA GGG AAG GAG AGG GAG GGG GTC TCA
3' primer VH-H3: 5'-biotin-GGA ATT GTC TCT GGA GAT GGT GAA
5' primer VH-H5: 5'-GAG CCG AGG ACA CGG CCG TGT ATT ACT GTG CAA GA
3' primer VH-H5: 5'-biotin-GCG CTG CTC ACG GTG ACC AGG GTA CCT TGG CCC CA

The fully synthetic internal primers used in the assembly
VH-H1: 5'-GAG GTG CAG CTG TTG GAG TCT GGG GGA GGC TTG GTA CAG CCT GGG GGG TCC CTG AGA CTC TCC TGT
VH-H4: 5'-GGC CGT GTC CTC GGC TCT CAG GCT GTT CAT TTG CAG ATA CAG CGT GTT CTT GGA ATT GTC TCT GGA GAT GGT

Amplification primers used in the assembly PCR
VH-AP1: 5'-ACT CGC GGC CCA ACC GGC CAT GGC CGA GGT GCA GCT GTT GGA G
VH-AP2: 5'-CAA CTT TCT TGT CGA CTT TAT CAT CAT CAT CTT TAT AAT CGC TGC TCA CGG TGA CCA

The location of each primer in the assembly reaction is shown in Fig. 1B.
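
The assembly relies on adjacent fragments sharing complementary ends. As an illustrative check, the sketch below takes two primer sequences from Table 1 (the 3' primer VH-H2 and the 5' primer VH-H3) and looks for the stretch over which they could anneal. The function names are hypothetical, and treating the 3' primer of one CDR fragment as the antisense partner of the next fragment's 5' primer is our reading of Fig. 1, not a statement from the original methods.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (A/C/G/T only)."""
    return seq.translate(COMPLEMENT)[::-1]


def annealing_overlap(upstream_antisense, downstream_sense):
    """Longest suffix of the upstream sense strand that is a prefix of the downstream primer."""
    upstream_sense = reverse_complement(upstream_antisense)
    longest = min(len(upstream_sense), len(downstream_sense))
    for k in range(longest, 0, -1):
        if upstream_sense.endswith(downstream_sense[:k]):
            return downstream_sense[:k]
    return ""


# Sequences copied from Table 1 with the codon spacing removed.
VH_H2_3PRIME = "TCCCTGGAGCCTGGCGGACCCA"
VH_H3_5PRIME = "CGCCAGGCTCCAGGGAAGGAGAGGGAGGGGGTCTCA"

overlap = annealing_overlap(VH_H2_3PRIME, VH_H3_5PRIME)
print(len(overlap), overlap)
```

For this primer pair the shared stretch is 16 nucleotides long, which is the region through which the CDR1- and CDR2-containing fragments could prime on each other.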

2.4. Assembly of overlapping oligonucleotides into full-length
VH genes

Five overlapping internal oligonucleotides and two
amplification primers were used in the assembly PCR
(Söderlind et al., 1995). The three internal oligonucleotides
encoding the CDR were prepared as described
above, whereas all other oligonucleotides were synthesised
and purified as described previously (Söderlind
et al., 1995). The amplification primers were included
to increase the copy number of each assembled gene
and to provide relevant restriction sites. The genes were
assembled in a 25 µl reaction with 200 µM of each dNTP,
3 nM of each internal primer, 0.3 µM of each amplification
primer and 0.625 U AmpliTaq. The reaction profile
used was: 94°C for 1 min, 55°C for 1 min, 72°C for
2 min for 30 cycles. The assembled VH genes were purified
from a 1.5% SeaPlaque low-melting agarose gel
using QIAEX II (QIAGEN) and re-amplified in 50 µl
by PCR with primers that were identical to approx. 20
of the outermost bases of the PCR product. The reaction
profile used was: 94°C for 1 min, 45°C for 1 min, 72°C
for 2 min for 2 cycles, followed by 10 cycles with the
profile: 94°C for 1 min, 55°C for 1 min, 72°C for 2 min.



The re-amplified PCR product was gel-purified, as 
described above. 
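
The combinatorial effect of the assembly step can be illustrated with a small simulation. This is a minimal sketch under simplifying assumptions: each CDR position is drawn independently and uniformly from its pool, which the actual PCR need not do; the function `shuffle_cdrs` is hypothetical; the CDR1 and CDR2 pool entries are germline names used as labels, and the CDR3 labels are invented placeholders.

```python
import random
from math import prod


def shuffle_cdrs(pools, n_clones, seed=0):
    """Draw one CDR per position, independently, for each simulated clone."""
    rng = random.Random(seed)
    return [tuple(rng.choice(pool) for pool in pools) for _ in range(n_clones)]


# Illustrative pools only; labels stand in for amplified CDR sequences.
CDR1_POOL = ["DP-31", "DP-32", "DP-35", "DP-47"]
CDR2_POOL = ["DP-31", "DP-35", "DP-47", "DP-53", "DP-77"]
CDR3_POOL = ["CDR3-a", "CDR3-b", "CDR3-c"]  # hypothetical placeholders
pools = [CDR1_POOL, CDR2_POOL, CDR3_POOL]

clones = shuffle_cdrs(pools, n_clones=19)
max_combinations = prod(len(pool) for pool in pools)
mixed_origin = sum(1 for cdr1, cdr2, _ in clones if cdr1 != cdr2)

print(f"possible CDR1/CDR2/CDR3 combinations: {max_combinations}")
print(f"clones with CDR1 and CDR2 from different germlines: {mixed_origin} of {len(clones)}")
```

The number of possible combinations grows as the product of the pool sizes, which is why sampling all three CDR pools in one assembly step can generate a large diversity from a modest number of templates.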



2.5. Cloning of the assembled genes

The assembled, re-amplified and purified genes were
digested with NcoI and SalI and cloned into the
phagemid vector pEXmide5 for sequencing. This vector
was originally constructed by introducing a cloning
linker (Johansson and Söderlind, unpublished data) in
the phage display vector pEXmide4 (Kobayashi et al.,
1997). The genetic library was electroporated into E.
coli XL-1 Blue and single colonies were used for plasmid
preparation, using QIAGEN mini plasmid kit
(QIAGEN).



2.6. Sequencing and genetic analysis

The complexity of the library was determined by 
sequencing both strands of 19 clones with Ready 
Reaction dye-terminator kit (Perkin Elmer), using 
specific primers. The sequences were also determined
with Ready Reaction dye-primer kit (Perkin Elmer),
using the M13 reverse primer. Sequence No. 5 was also
sequenced using the BigDye Terminator Cycle 
Sequencing Ready Reaction Kit (Perkin Elmer). The 
sequencing reactions were analysed using an ABI 377 
automatic DNA sequencer and the sequences were eval- 
uated using the Wisconsin sequence analysis package 
(version 8; Genetics Computer Group Inc., Madison, 
WI), using the FASTA algorithm. The origin of each 
CDR was determined by comparison with known 
variable gene sequences available in the March 1997 
release of the V BASE sequence directory (Cook and 
Tomlinson, 1995). 
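
The comparison to germline sequences can be pictured with two small helpers. This is a simplified sketch, not the FASTA-based analysis used in the study: it assumes the sequences are already aligned and of equal length, the function names are hypothetical, the amino acid query `SDYYMS` is read off the Table 2 entry `.D.Y..`, and the nucleotide strings are invented for illustration.

```python
def dot_notation(reference, query):
    """Show a query against a reference: '.' for identity, the query residue otherwise."""
    if len(reference) != len(query):
        raise ValueError("sequences must be aligned to the same length")
    return "".join("." if r == q else q for r, q in zip(reference, query))


def count_differences(a, b):
    """Count mismatched positions between two aligned sequences of equal length."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(x != y for x, y in zip(a, b))


DP47_CDR1 = "SSYAMS"  # DP-47 CDR1 including codon 30, as listed in Table 2

print(dot_notation(DP47_CDR1, "SDYYMS"))
print(dot_notation(DP47_CDR1, DP47_CDR1))
print(count_differences("AGCAGCTAT", "AGCGACTAT"))  # hypothetical nucleotide strings
```

The first helper reproduces the dot notation used in the homology columns of Table 2, and the second corresponds to the "No. of mutations" columns when applied to aligned nucleotide sequences.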
3. Results and discussion

3.1. Selection of a master framework

The DP-47 germline gene was used as a master 
framework, together with in vivo formed CDR originating
from different germline genes, for the construction
of a gene library encoding soluble VH domains. The
master framework was selected since it folds easily and 
is well expressed in E. coli (Kobayashi et al., 1997), an
important feature in antibody engineering (Plückthun
and Pack, 1997). Primers suitable for this system have
previously been defined and characterised with respect
to hetero- and homodimer formation (Kobayashi et al.,
1997). These data were applied in this study. Antibody
fragments originating from the DP-47 germline gene are
frequently selected in phage display systems (Griffiths
et al., 1994). In addition, it is a germline gene that is
often used in human immune repertoires (Huang et al.,
1996; Ohlin and Borrebaeck, 1996), making it a good 
choice if selected clones are to be tolerated in in vivo 
applications. 



3.2. The complexity of the library 

A library of 9 × 10^6 members was created by transforming
electrocompetent XL-1 Blue E. coli. To evaluate
the complexity of the library, 19 randomly picked clones 
were sequenced. These sequences revealed that the CDR 
shuffled into the DP-47 framework originated from a 
diverse set of VH genes. Also, the CDR1 and CDR2
found in each assembled gene frequently originated from
different VH genes, demonstrating that they had been
combined randomly in each product (Table 2). Even
though the V-region template was determined by sequencing
to contain genes encoding members of the VH1, VH3
and VH4 families at a ratio of 3:12:4 (n=19), the
sequences revealed that all CDR1 and CDR2 found in
the CDR-shuffled VH-library belonged to the VH3 family.
This is also the family to which the DP-47 master framework
belongs. The fact that only CDR originating from
VH genes belonging to the VH3 family were found is,
however, a consequence of our specific primer design
(data not shown) and other VH families have been
amplified from the same DNA template using a different
primer design (Ohlin et al., unpublished data).
Furthermore, the amplified CDR exhibited diversity and
were, in all cases except three (one CDR each in sequence
No. 7, 12 and 22), encoded by germline genes which are
considered to be functional (Cook and Tomlinson,
1995). The fact that we can also amplify CDR from
pseudogenes allows us to tap an even greater variability,
since these particular CDR encode an in-frame
product. Importantly, the shuffled CDR3 sequences were
highly diverse in length (Table 2) and sequence (data
not shown). In general, the length distribution (mean
length=12.3 aa) was close to lengths observed in antibodies
developed in vivo (Ohlin and Borrebaeck, 1996;
Wu et al., 1993). Mutations were found in the CDR
despite the fact that the template used for CDR amplification
originates from peripheral blood lymphocytes
producing IgM antibodies. These observed mutations
were not typical Taq polymerase-induced errors (Tindall
and Kunkel, 1988), and their presence is consistent with
recent findings of somatically mutated IgM-encoding
genes (Klein et al., 1997; Brezinschek et al., 1997).

Table 2

Demonstration of the complexity of the assembled, randomly selected genes*

          CDR1                                  CDR2                                             CDR3
Sequence  Germline      No. of     Homology to  Germline  No. of     Homology to                 No. of     Gene assembled
no.                     mutations  DP-47 aa               mutations  DP-47 aa                    aa         in frame

DP-47                              SSYAMS                            AISGSGGSTYYADSVKGR
2         DP-35         1/18       .D.Y..       DP-42     0/51       V . Y-                      12         Yes
3         DP-49, DP-50  1/18       ...G.H       DP-53     0/54       R.NSD.S. .S                 13         No
5         DP-47         0/18       ......       DP-51     7/54       Y . FRISSTV. . . E          11         Yes
6         DP-32         0/18       DD.G..       DP-47     0/54                                   10         No
7         DP-41         1/18       .N.G.H       DP-47     0/54                                   8          No
8         DP-32         0/18       DD.G..       DP-77     0/54       S..S.SSYI                   12         Yes
9         DP-31         0/18       DD...H       DP-47     3/54       S                           12         Yes
10        DP-31         0/18       DD...H       DP-31     3/54       G..WNSV.IV..E               14         No
12        DP-47         0/18       ......       DP-60     2/54       G STI                       18         No
13        DP-49, DP-50  0/18       ...G.H       DP-35     4/54       Y..S..NTIN E . .            16         No
14        DP-49, DP-50  0/18       ...G.H       DP-35     2/54       Y..S..STI                   7          No
15        DP-48         0/18       ...D.H       DP-48     0/51       . . G-T A . D. . -PG        13 2/3     No
16        DP-51, DP-77  0/18       ...S.N       DP-47     1/54       V                           13         No
17        DP-35         4/18       .D.SID       DP-31     0/54       G . . WNS . . I G           10         Yes
19        DP-46, DP-61  2/18       IT...H       DP-53     7/54       R.NED.SD.N. . . A . . . .   7          No
20        DP-31         0/18       DD...H       DP-47     7/54       S . . S N . R P . . . .     7          Yes
21        DP-31         0/18       DD...H       DP-77     1/54       Y..S.SSYI                   25         Yes
22        DP-41         0/18       .N.G.H       DP-35     0/54       Y..S..STI....               15 1/3     No
23        DP-35         0/18       .D.Y..       DP-53     4/54       R.NSD.ST.G. . E             10         Yes

*The origin of each CDR1 and CDR2, the number of nucleotide differences from the germline sequence and the length of CDR3 were determined.
Each CDR1 in this evaluation includes codon 30, and each CDR2 includes codon 66, according to Kabat et al. (1991), as these sequences are
not encoded by the synthetic primers. Included is also an amino acid (aa) comparison to CDR1 and CDR2 of DP-47 including the FR residues
encoded by the template. The majority of the frameshift mutations reside in the synthetic primer region, and only in two cases in the CDR
(sequences 15 and 22). Whenever a CDR showed maximal homology to several germline genes, functional genes described by Cook and
Tomlinson (1995) were always chosen. Sequence no. 12, and sequences no. 7 and 22, showed maximum homology to pseudogenes (DP-60 and DP-41,
respectively).
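
The summary figures quoted in the text can be recomputed from the CDR3 column of Table 2. The sketch below is an illustration under stated assumptions: the entry for sequence 5 is read as 11 aa, the entries for sequences 15 and 22 are read as 13 2/3 and 15 1/3 aa (fractional lengths from the frameshifted CDR3), and rows are matched to sequence numbers by their order in the table.

```python
from fractions import Fraction

# (sequence no., CDR3 length in aa, gene assembled in frame), read from Table 2.
TABLE2_CDR3 = [
    (2, 12, True), (3, 13, False), (5, 11, True), (6, 10, False),
    (7, 8, False), (8, 12, True), (9, 12, True), (10, 14, False),
    (12, 18, False), (13, 16, False), (14, 7, False),
    (15, Fraction(41, 3), False),  # assumed reading of "13 2/3"
    (16, 13, False), (17, 10, True), (19, 7, False), (20, 7, True),
    (21, 25, True),
    (22, Fraction(46, 3), False),  # assumed reading of "15 1/3"
    (23, 10, True),
]

n_clones = len(TABLE2_CDR3)
mean_length = sum(length for _, length, _ in TABLE2_CDR3) / n_clones
n_in_frame = sum(1 for _, _, in_frame in TABLE2_CDR3 if in_frame)

print(f"clones sequenced: {n_clones}")
print(f"mean CDR3 length: {float(mean_length):.1f} aa")
print(f"genes assembled in frame: {n_in_frame}/{n_clones}")
```

With these readings the mean CDR3 length rounds to 12.3 aa and 8 of the 19 genes are in frame, in agreement with the values stated in Sections 3.2 and 3.3.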

3.3. Analysis of the assembled VH genes

When further analysing the assembled genes, it was
evident that 8/19 genes contained no frameshift mutations
and encoded a complete product. When analysing
the 11 non-functional genes, the deletions/insertions frequently
occurred in sequences encoded by the synthetic
oligonucleotides, i.e. primers not containing the CDR.
In only two cases (sequences 15 and 22) did the frameshift
mutation occur in a CDR, in both cases in CDR3,
suggesting that these CDR originated
from genes which had been improperly rearranged in
vivo. The frameshift mutations residing in the framework
regions were probably introduced during the synthesis
of the synthetic oligonucleotides. In the synthesis
of longer segments, the frequency of full-length, correctly 
deprotected product can drop dramatically. Despite the 
fact that the oligonucleotides used in the assembly had 
been purified on Oligonucleotide Purification Cartridge 
(Perkin Elmer), some genes still contained these 
deletions/insertions. Refinements in the technology of 
synthesising the oligonucleotides will improve the quality 
of the gene library and, thus, the power of this approach 
even further. Single-stranded DNA prepared from PCR 
products seems to have a much lower frequency of
insertions/deletions, and the use of this type of single- 
stranded DNA will facilitate the assembly of func- 
tional genes. 

3.4. Conclusions 

In conclusion, we present data for a molecular design 
where gene segments formed in vivo can be shuffled into 
preselected framework regions, thus allowing the generation
of an enormous diversity. This approach does not
require engineering of restriction sites for the CDR 
shuffling and the variation introduced is the natural 
variation occurring in antibody genes. Our data support 
the proposal that this synthetic approach permits ran- 
domisation of all three CDR in one step. By using in 
vivo formed CDR, the length, sequence and combination 
of CDR can be varied at the same time. Thus, it does 
not require the design and synthesis of mutated synthetic 
oligonucleotides (Crameri and Stemmer, 1995; Virnekas
et al., 1994), and by varying the CDR primers and the 
template source, it will also be possible to modify the 
complexity of the library. This approach will not only 
be applicable in antibody engineering, but also in the 
engineering of other proteins, where functional segments
can be shuffled to obtain a protein with improved 
qualities. 